Abstract

We present a quantum algorithm that verifies a product of two $n \times n$ matrices over any integral
domain with bounded error in worst-case time $O(n^{5/3})$ and expected time $O(n^{5/3} / \min(w, \sqrt{n})^{1/3})$,
where $w$ is the number of wrong entries. This improves the previous best algorithm [ABH+02]
that runs in time $O(n^{7/4})$. We also present a quantum matrix multiplication algorithm that is
efficient when the result has few nonzero entries.

1 Introduction 

The computational complexity of matrix multiplication is a subject of extensive study. Matrix
multiplication is the central algorithmic part of many applications, for example solving linear
systems of equations and computing the transitive closure. A fast algorithm for matrix multiplication
thus implies a fast algorithm for a variety of computational tasks. Strassen [Str69] was the first
to show that, surprisingly, two $n \times n$ matrices can be multiplied in time $n^{2+\alpha}$ for $\alpha < 1$. His result
was improved by many subsequent papers. The best known bound to date is an algorithm with
$\alpha \approx 0.376$ by Coppersmith and Winograd [CW90]. It is a main open problem to determine the
true value of $\alpha$. Freivalds showed [Fre79] that verifying whether the product of two $n \times n$ matrices
is equal to a third can be done with high probability in time proportional to $n^2$. We will refer to
this latter problem as matrix verification.
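
To make the classical baseline concrete, the randomized check attributed to Freivalds can be sketched as follows. This is a minimal illustration over the integers, not code from [Fre79]; the function name `freivalds`, the number of rounds, and the `rng` parameter are assumptions of this sketch. Each round costs three matrix-vector products, that is time proportional to $n^2$.

```python
import random

def freivalds(A, B, C, rounds=20, rng=random):
    """Return False if AB != C is detected, True otherwise (one-sided error)."""
    n = len(A)
    for _ in range(rounds):
        # Random 0/1 vector x; compare A(Bx) with Cx instead of AB with C.
        x = [rng.randint(0, 1) for _ in range(n)]
        Bx = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
        ABx = [sum(A[i][j] * Bx[j] for j in range(n)) for i in range(n)]
        Cx = [sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]
        if ABx != Cx:
            return False  # a witness x was found, so AB != C for certain
    return True  # no witness found; AB = C with high probability
```

A correct product is never rejected; a wrong product survives one round with probability at most 1/2, so the error decreases exponentially in the number of rounds.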

We study the computational complexity of matrix multiplication and verification on a quantum
computer. The first to study matrix verification in the quantum mechanical setting were Ambainis,
Buhrman, Høyer, Karpinski, and Kurur [ABH+02], who used a clever recursive version of Grover's
algorithm to verify whether the product of two $n \times n$ matrices equals a third in time $O(n^{7/4})$, thereby improving
the optimal classical bound of Freivalds.

In this paper we will construct a bounded error quantum algorithm for the matrix verification
problem that runs in time $O(n^{5/3})$. Suppose we are verifying whether $A \times B = C$. When the
number of "wrong" entries in $C$ is $w$, our algorithm runs in expected time $O(n^{5/3} / \min(w, \sqrt{n})^{1/3})$.
For $w = \sqrt{n}$ we have a matching lower bound.

Our algorithm uses the quantum random walk formalism by Szegedy [Sze04] that he developed
as a generalization of the quantum random walk technique of Ambainis [Amb04]. Ambainis used
a quantum random walk to obtain an optimal quantum algorithm for the element distinctness
problem. If one were to adapt that method directly to the setting of matrix verification, one would
obtain an $O(n^{5/3})$ algorithm in terms of queries to the input. However, that algorithm still requires
$O(n^2)$ time, because it computes several times a matrix product of sub-matrices that are loaded
into the memory. This costs no additional queries, but it will take additional time. The rest of the
paper is devoted to improving the time complexity of the quantum algorithm to $O(n^{5/3})$.

We perform a quantum random walk on the product of two Johnson graphs, analyze its spectral 
gap, that is the second smallest eigenvalue of its Laplacian, and estimate that in our setting enough 
of the nodes are marked if $A \times B \ne C$. See Section 3 for a detailed description of our algorithm.
We next introduce a combinatorial tool to analyze the behavior of our algorithm when many of 
the entries are wrong. Finally we use our fast quantum matrix verification algorithm as a building 
block to construct a quantum algorithm for computing the actual matrix product A x B that is 
substantially faster than any known classical method, when there are not too many non-zero entries 
in the final product. 

2 Preliminaries 

2.1 Quantum query complexity 

We assume familiarity with quantum computing [NC00] and sketch the model of quantum query
complexity. Suppose we want to compute some function $f$. For input $x \in \{0,1\}^N$, a query gives
us access to the input bits. It corresponds to the unitary transformation

$$O : |i, b, z\rangle \mapsto |i, b \oplus x_i, z\rangle.$$

Here $i \in \{1, \ldots, N\}$ and $b \in \{0,1\}$; the $z$-part corresponds to the workspace, which is not affected
by the query. We assume the input can be accessed only via such queries. A $t$-query quantum
algorithm has the form $A = U_t O U_{t-1} \cdots O U_1 O U_0$, where the $U_j$ are fixed unitary transformations,
independent of $x$. This $A$ depends on $x$ via the $t$ applications of $O$. The algorithm starts in the
initial state $|0^k\rangle$ and its output is the result of measuring a dedicated part of the final state $A|0^k\rangle$,
where $k$ is the total amount of space used by the algorithm.
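
On computational basis states the query is just a permutation, which can be illustrated classically. The sketch below is only an illustration: the name `query_oracle`, the 0-based indexing, and the one-bit workspace are assumptions of this example. It shows that the query maps basis states to basis states bijectively and is its own inverse.

```python
from itertools import product

def query_oracle(x):
    """Return the map (i, b, z) -> (i, b XOR x[i], z) for 0-based indices i."""
    def apply(state):
        i, b, z = state
        return (i, b ^ x[i], z)  # the workspace z is left untouched
    return apply

x = [1, 0, 1]                      # a hypothetical 3-bit input
O = query_oracle(x)
# All basis states with a one-bit workspace register.
basis = list(product(range(len(x)), (0, 1), (0, 1)))
images = [O(s) for s in basis]
```

Since the images are a rearrangement of the basis, the corresponding linear map is a permutation matrix and hence unitary.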

2.2 Quantum search 

One of the most interesting quantum algorithms is Grover's search algorithm [Gro96, BBHT98].
It can find an index $i$ of an input bit $x_i$ in an $n$-bit input such that $x_i = 1$ in an expected number of
$O(\sqrt{n/(|x|+1)})$ queries, where $|x|$ is the Hamming weight (number of ones) of the input. Grover's
algorithm can be cast in more general terms as amplitude amplification: given a quantum algorithm
$A$ that accepts with probability $p$, it can be amplified to have constant success probability
with $O(1/\sqrt{p})$ iterations of $A$.

Given $n$ numbers $x_1, \ldots, x_n$ as input, the element distinctness problem is the task to determine
whether there are two distinct indices $i$ and $j$ such that $x_i = x_j$. Ambainis in a very nice paper
[Amb04] applied quantum random walks in a novel way and constructed a quantum algorithm that
solves element distinctness in $O(n^{2/3})$ queries. This algorithm is faster than the algorithm [BDH+01],
which is based on amplitude amplification and uses $O(n^{3/4})$ queries. Ambainis's method was
generalized by Szegedy [Sze04] for all graphs and even for all symmetric Markov chains (with non-uniform
transition probabilities). Szegedy's method can be regarded as a quantum walk version of
amplitude amplification [BHMT02].

2.3 Previous best algorithm for verification of matrix products

Ambainis et al. [ABH+02] discovered a quantum algorithm running in time $O(n^{7/4})$. Since it was
never published, we will briefly sketch it here. Let $A, B, C$ be $n \times n$ matrices. First, partition the
matrices $B$ and $C$ into $\sqrt{n}$ blocks of $\sqrt{n}$ columns each. It holds that $AB = C$ iff $AB_i = C_i$ for every
$i$, where $B_i$ and $C_i$ are the sub-matrices of size $n \times \sqrt{n}$. The verification of $AB_i = C_i$ can be done
with bounded error in time $O(n^{3/2})$ as follows: choose a random vector $x$ of length $\sqrt{n}$, multiply
both sides of the equation by $x$ from the right side, compute classically $y = B_i x$ and $z = C_i x$, and
verify the matrix-vector product $Ay = z$ by a Grover search. The search over $n$ rows takes $O(\sqrt{n})$
iterations and a verification of one row takes time $n$. Now, we apply amplitude amplification on
top of this subroutine $V_i$, and compute the And of all $\sqrt{n}$ blocks using $n^{1/4}$ calls to $V_i$,
for a total running time of $O(n^{1/4} \cdot n^{3/2}) = O(n^{7/4})$.

2.4 Notation 

Let $[n]$ denote the set $\{1, 2, \ldots, n\}$. Let $A^{n \times m}$ denote a matrix $A$ of dimension $n \times m$. Let $A^T$
denote the transpose of $A$. For $R \subseteq [n]$, let $A|_R$ denote the $|R| \times m$ sub-matrix of $A$ restricted
to the rows from $R$. Analogously, for every $S \subseteq [m]$, let $A|^S$ denote the $n \times |S|$ sub-matrix of $A$
restricted to the columns from $S$. Let $\lambda(M)$ denote the spectral norm of a matrix $M$; it is equal to
the largest eigenvalue of $M$ for symmetric $M$. For a set $S$, let $\binom{S}{k}$ denote all subsets of $S$ of size $k$.
An integral domain is a commutative ring with identity and no divisors of 0.

For a graph $G$, let $V_G$ denote the vertices of $G$ and let $E_G$ denote the edges of $G$. The normalized
Laplacian matrix $\mathcal{L}(G)$ of an undirected graph $G$ is a symmetric $|V_G| \times |V_G|$ matrix defined by
$\mathcal{L}_{i,j}(G) = 1$ if $i = j$, it is $-1/\sqrt{d_i d_j}$ if $i \ne j$ and $(i,j) \in E_G$, and 0 otherwise, where $d_i$ denotes
the degree of vertex $i$. The spectral gap of a graph $G$, often
called the Fiedler value of $G$, is equal to the second smallest eigenvalue of $\mathcal{L}(G)$; it is nonzero if
$G$ is connected. The Johnson graph $J(n,k)$ is defined as follows: its vertices are subsets of $[n]$ of
size $k$, and two vertices are connected iff they differ in exactly one number. Let $G = G_1 \times G_2$
denote the graph categorical product of two graphs $G_1, G_2$, defined as follows: $V_G = V_{G_1} \times V_{G_2}$, and
$((g_1, g_2), (g_1', g_2')) \in E_G$ iff $(g_1, g_1') \in E_{G_1}$ and $(g_2, g_2') \in E_{G_2}$.
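
The two graph definitions above can be made concrete for small parameters. The following sketch is an illustration under our own conventions: the function names are hypothetical, and each undirected edge is stored as two ordered pairs. It builds $J(n,k)$ and the categorical product, so that one can check that $J(n,k)$ is regular of degree $k(n-k)$ and that degrees multiply in the product.

```python
from itertools import combinations

def johnson_graph(n, k):
    """Vertices: k-subsets of {1..n}; edges: ordered pairs of subsets differing in one element."""
    vertices = [frozenset(c) for c in combinations(range(1, n + 1), k)]
    edges = {(u, v) for u in vertices for v in vertices if len(u - v) == 1}
    return vertices, edges

def categorical_product(g1, g2):
    """((a, b), (c, d)) is an edge iff (a, c) is an edge of g1 and (b, d) is an edge of g2."""
    (v1, e1), (v2, e2) = g1, g2
    vertices = [(a, b) for a in v1 for b in v2]
    edges = {((a, b), (c, d)) for (a, c) in e1 for (b, d) in e2}
    return vertices, edges

def degree(graph, vertex):
    """Number of ordered edges leaving the given vertex."""
    _, edges = graph
    return sum(1 for (u, _) in edges if u == vertex)
```

A step in the product graph exchanges one element of $R$ and one element of $S$ simultaneously, which is exactly the walk step used by the algorithm below.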

3 Algorithm for verification of matrix products 

Let $A, B, C$ be $n \times n$ matrices over any integral domain. A verification of a matrix product is
deciding whether $AB = C$. We construct an efficient quantum walk algorithm for this problem. It
is described in Figure 1 and its expected running time is analyzed in Sections 4 and 5.

The basic outline is the following: Verify Once estimates the scalar product of the superposition
computed by the quantum walk and the uniform superposition. The inequality $a_R \cdot b_S \ne c_{R,S}$ can
only be true when $A|_R \cdot B|^S \ne C|_R^S$, that is when $C|_R^S$ contains at least one wrong entry. If $AB = C$,
then the quantum walk does nothing, because the phase flip is never performed and the diffusion
on a uniform superposition is equal to identity. Hence the superposition computed by the quantum
walk stays uniform, and the measurement of $|z\rangle$ always yields 0. On the other hand, if $AB \ne C$
and $k$ is sufficiently large, then for $\ell$ drawn uniformly from $\{1, 2, \ldots, k\}$, with high probability, the
quantum walk converges to a superposition almost orthogonal to the uniform superposition and
the measurement of $|z\rangle$ yields 1 with probability close to $\frac12$. The loop in Product Verification tries
a sequence of exponentially increasing $k$. The idea of multiplying the matrices in Verify Once from
both sides by random vectors $p, q$ is explained in Section 4.1. It allows us to achieve both a better
running time and smaller space complexity.
Product Verification (input size $n$, matrices $A, B, C$) returns 1 when $AB \ne C$:

1. Take any $1 < \lambda < \frac87$, for example $\lambda = \frac{15}{14}$.

2. For $i = 0, 1, \ldots, \log_\lambda(n^{2/3}) + 9$, repeat 16 times the following:

   • Run Verify Once ($\sqrt8 \cdot \lambda^i$).

   • If it returns 1, then return "not equal".

3. Return "equal".

Verify Once (number of rows $k$) returns 1 when $AB \ne C$ is detected:

4. Pick the number of iterations $\ell$ uniformly at random from $\{1, 2, \ldots, k\}$.
   Pick a random row vector $p$ and a random column vector $q$ of length $n$.

5. Initialization. Put the quantum register into the superposition
   $\sum_{R \subseteq [n], |R| = k} \sum_{S \subseteq [n], |S| = k} |R\rangle |S\rangle$.
   (Think of $R$ as a subset of the rows of $A$ and $S$ as a subset of the columns of $B$.)
   Compute $a_R = p|_R \cdot A|_R$, $b_S = B|^S \cdot q|_S$, and $c_{R,S} = p|_R \cdot C|_R^S \cdot q|_S$ in time $2kn + k^2$.
   Let $|z\rangle = |+\rangle = \frac{1}{\sqrt2}(|0\rangle + |1\rangle)$. The quantum state is now
   $|+\rangle \sum_{R,S} |R, a_R\rangle |S, b_S\rangle |c_{R,S}\rangle$.

6. Quantum walk. Conditioned on $|z\rangle$, perform $\ell$ iterations of the following:

   (a) Phase flip. Multiply the quantum phase by $-1$ iff $a_R \cdot b_S \ne c_{R,S}$. The scalar
       product is verified in time $n$ using no queries.

   (b) Diffusion. Perform one step of quantum walk on $(R, S)$, that is exchange one
       row and one column. The update of $a_R$, $b_S$, and $c_{R,S}$ costs $2n$ queries to $A$,
       $2n$ queries to $B$, and $4k$ queries to $C$.

7. Apply the Hadamard transform on $|z\rangle$, measure it, and return the outcome.

Figure 1: Quantum algorithm for verification of matrix products
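
The classical bookkeeping attached to one pair $(R, S)$ in Figure 1 can be sketched in ordinary code. The sketch below works over the integers with 0-based indices; the function names `compress`, `flagged`, and `swap_row` are our own and serve only as an illustration of steps 5, 6(a), and the row half of 6(b). It does not simulate the quantum walk itself.

```python
def compress(A, B, C, R, S, p, q):
    """Return (a_R, b_S, c_RS) as in step 5 for row set R and column set S."""
    n = len(A)
    a = [sum(p[i] * A[i][l] for i in R) for l in range(n)]   # p|_R . A|_R, a row vector
    b = [sum(B[l][j] * q[j] for j in S) for l in range(n)]   # B|^S . q|_S, a column vector
    c = sum(p[i] * C[i][j] * q[j] for i in R for j in S)     # p|_R . C|_R^S . q|_S, a scalar
    return a, b, c

def flagged(a, b, c):
    """Phase-flip predicate of step 6(a): a_R . b_S != c_{R,S}, computed in time n."""
    return sum(x * y for x, y in zip(a, b)) != c

def swap_row(a, c, A, C, S, p, q, old, new):
    """Update a_R and c_{R,S} when row `old` leaves R and row `new` enters it."""
    n = len(A)
    # Touches 2n entries of A and 2|S| entries of C.
    a = [a[l] + p[new] * A[new][l] - p[old] * A[old][l] for l in range(n)]
    c = c + sum((p[new] * C[new][j] - p[old] * C[old][j]) * q[j] for j in S)
    return a, c
```

The column exchange is symmetric and touches $2n$ entries of $B$ and $2k$ further entries of $C$, which accounts for the query counts stated in step 6(b).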
4 Analysis of the algorithm

In this section, we prove that Product Verification has one-sided bounded error and estimate
its expected running time depending on the set of wrong entries. We use a recent result by
Szegedy [Sze04], which can be regarded as a quantum walk version of quantum amplitude amplification
[BHMT02]. Its proof is outlined in Appendix A.

Theorem 1 (Szegedy [Sze04]) Let $G$ be an undirected graph on vertex set $X$, and let $\delta_G$ be the
spectral gap of $G$. Some set of vertices $M \subseteq X$ are marked with the promise that $|M|$ is either zero
or at least $\varepsilon |X|$. For every $m = \Omega(1/\sqrt{\delta_G \varepsilon})$, the following quantum algorithm decides whether $M$
is non-empty with one-sided error $\gamma \le \frac78$ in time $O(T_X + m \cdot (T_M + T_G))$:

1. Initialization. Compute a uniform superposition over $X$; let the time be $T_X$.

2. Pick $1 \le \ell \le m$ uniformly at random. Repeat $\ell$ times the following:

   (1) Phase flip. Flip the quantum phase if an element is marked; let the time be $T_M$.

   (2) Diffusion. Perform one step of quantum walk on $G$; let the time be $T_G$.

3. Estimate the scalar product of the quantum walk distribution and the uniform distribution.

4.1 Analysis of Product Verification

We analyze the expected running time of the algorithm as follows. Let Verify Full denote a modified
version of Verify Once that does not use the random vectors $p, q$, but instead reads all the sub-matrices
into memory and verifies $A|_R \cdot B|^S = C|_R^S$ in the phase-flip step (6a). Verify Full has
the same query complexity as Verify Once, but its space complexity and running time are bigger.
(Although the phase-flip step (6a) costs no additional queries, the time needed to compute classically
$A|_R \cdot B|^S$ is at least $kn$, whereas the scalar product $a_R \cdot b_S$ can be computed in time $n$.$^1$) We analyze
the error of Verify Full, because the multiplication by $p, q$ complicates the analysis. For example,
if there is exactly one wrong entry and the multiplication is over GF(2), then with probability $\frac34$,
the wrong entry is completely hidden by multiplication by zero. However, we prove the following
statement:

Lemma 2 Let $AB \ne C$. The probability that Verify Once ($\sqrt8 \cdot k$) outputs 1 at least once in 16
independent trials, each time with new random $p, q$, is bigger than the success probability of one
call to Verify Full ($k$).
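
The GF(2) example mentioned before Lemma 2 can be checked by exhaustive enumeration. In the sketch below (the function name and the 0-based position are assumptions of this illustration), $D = AB - C$ has a single nonzero entry at position $(i, j)$, so $p \cdot D \cdot q = p_i q_j$ over GF(2), and the wrong entry is hidden whenever this product is zero.

```python
from itertools import product

def hidden_count(n, i, j):
    """Count pairs (p, q) in GF(2)^n x GF(2)^n with p . D . q = 0, D having a single 1 at (i, j)."""
    hidden = total = 0
    for p in product((0, 1), repeat=n):
        for q in product((0, 1), repeat=n):
            total += 1
            if (p[i] * q[j]) % 2 == 0:   # p . D . q reduces to p_i * q_j
                hidden += 1
    return hidden, total
```

For $n = 3$ this counts 48 hidden pairs out of 64, in agreement with the probability $\frac34$ quoted above.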

Using this lemma, it is sufficient to analyze the error of the algorithm as if it is performing
Verify Full in each step. Let $W = \{(i,j) \mid (AB - C)_{i,j} \ne 0\}$ be the set of wrong entries of the matrix
product and let $R, S \subseteq [n]$ denote subsets of rows of $A$ and columns of $B$. We mark $(R, S)$ iff $C|_R^S$
contains a wrong entry, formally $A|_R \cdot B|^S \ne C|_R^S$, or equivalently $W \cap R \times S \ne \emptyset$. The performance
of the algorithm depends on the fraction of marked pairs $\varepsilon(W, k) = \Pr_{R,S}[(R, S) \text{ is marked}]$, where
$|R| = |S| = k$. In Section 5 we prove the following lower bound on $\varepsilon(W, k)$:



1 It seems that this slowdown "time $\gg$ #queries" is a typical property of algorithms using quantum walks: the
quantum algorithm for element distinctness [Amb04] needs to use random hash functions to remedy it, and it is open
whether triangle finding [MSS05] can be improved this way.

Lemma 3 Let $q(W) = \max(|W'|, \min(|W|, \sqrt n))$, where $W'$ is the largest independent subset of
$W$, that is it contains at most one 1 in every row and column. For every $W$ and $k \le n^{2/3}/q(W)^{1/3}$,
it holds that $\varepsilon(W, k) = \Omega(\frac{k^2}{n^2} q(W))$.

We also need the following two statements, whose proofs are in Appendix A.

Lemma 4 Let $1 \le k \le \frac n2$. The spectral gap of $G = J(n,k) \times J(n,k)$ is $\delta_G = \Theta(1/k)$.

Lemma 5 Let $|\varphi\rangle$ and $|\psi\rangle$ be quantum states, let $|X\rangle = \frac{1}{\sqrt2}(|0, \varphi\rangle + |1, \psi\rangle)$, and let $|Y\rangle = (H \otimes I)|X\rangle$.
If the first qubit of $|Y\rangle$ is measured in the computational basis, then $\Pr[Y = 1] = \frac12(1 - \langle\varphi|\psi\rangle)$.
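
Lemma 5 can be checked numerically on small state vectors. The following sketch assumes real amplitudes, so that $\langle\varphi|\psi\rangle$ is real as in the statement of the lemma; the helper names and the sample angle are our own choices for illustration.

```python
import math

def prob_one(phi, psi):
    """Pr[first qubit = 1] after applying H (x) I to (|0,phi> + |1,psi>)/sqrt(2)."""
    # The amplitudes attached to |1> are (phi - psi) / 2.
    return sum(abs(a - b) ** 2 for a, b in zip(phi, psi)) / 4

def overlap(phi, psi):
    """Inner product <phi|psi>."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

theta = 0.3                                   # an arbitrary sample angle
phi = [1.0, 0.0]
psi = [math.cos(theta), math.sin(theta)]
lhs = prob_one(phi, psi)
rhs = 0.5 * (1 - overlap(phi, psi))           # the formula of Lemma 5
```

The two extreme cases match the intuition used in Section 3: identical states never yield 1, and orthogonal states yield 1 with probability exactly $\frac12$.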

Theorem 6 Product Verification always returns "equal" if $AB = C$. If $AB \ne C$, then it returns
"not equal" with probability at least $\frac23$. Its worst-case running time is $O(n^{5/3})$, its expected running
time is $O(n^{5/3}/q(W)^{1/3})$, and its space complexity is $O(n)$.

Proof. By Lemma 2, if we replace the calls to Verify Once by Verify Full in Figure 1 and skip
repeating each loop 16 times and the multiplication of $k$ by $\sqrt8$, the success probability is decreased.
Hence if we compute an upper bound on the expected number of iterations of such an algorithm,
it will also hold for the original algorithm. Let us thus analyze the running time of the original
algorithm assuming the error analysis is of Verify Full. Verify Once walks $\ell$ quantum steps on the
graph categorical product of two Johnson graphs $G = J(n,k) \times J(n,k)$. The marked vertices of $G$
correspond to marked pairs $(R, S)$, that is the pairs such that $A|_R \cdot B|^S \ne C|_R^S$. The initialization
costs time $T_X = O(kn)$, a phase flip costs time $T_M = n$, and one step of the quantum walk costs
time $T_G = 4n + 4k = O(n)$. The running time of Verify Once is thus $O((k + \ell)n) = O(kn)$. The
scalar product of two distributions is estimated using Lemma 5.

Let $W \ne \emptyset$. By Theorem 1, Verify Once recognizes a wrong matrix product with bounded error
for every $m \ge \Omega(1/\sqrt{\delta_G\, \varepsilon(W,k)})$. Plug in $\varepsilon(W,k) = \Omega(\frac{k^2}{n^2} q(W))$ by Lemma 3 and $\delta_G = \Theta(\frac1k)$
by Lemma 4. We get that $m \ge \Omega(n/\sqrt{k\, q(W)})$. In our algorithm, we use $m = k$, which gives
the condition $k \ge k_0 = O(n^{2/3}/q(W)^{1/3})$. Hence for every $k \ge k_0$, Verify Once makes only small
one-sided error. The algorithm Product Verification does not know $q(W)$ and $k_0$, but it instead
tries different values of $k$ from the exponentially increasing sequence $\lambda^i$.

The total running time is dominated by the last run. The expected running time can be written
as a telescopic sum $E[T] = \sum_{t \ge 0} t \cdot \Pr[T = t] = \sum_{t \ge 1} \Pr[T \ge t]$. Product Verification (PV) calls
Verify Once with time $kn = \lambda^i n$ and each call after $k \ge k_0$ fails with probability $\gamma \le \frac78$, hence

$$E[T] = \sum_i \lambda^i n \cdot \Pr[\text{PV enters the } i\text{-th loop}] \le \sum_{i=0}^{(\log_\lambda k_0) - 1} \lambda^i n + \sum_{i = \log_\lambda k_0}^{(\log_\lambda n^{2/3}) + 9} \lambda^i n \cdot \gamma^{\,i - \log_\lambda k_0}$$

$$\le O(k_0 n) \Big(1 + \sum_{i \ge 0} (\lambda\gamma)^i\Big) = O(k_0 n) = O\big(n^{5/3}/q(W)^{1/3}\big),$$

because $\lambda\gamma < \frac87 \cdot \frac78 = 1$. The probability that a wrong product is never recognized is $\le \gamma^9 \le \frac13$,
where 9 is the number of loops after $n^{2/3}$.

PV never makes an error when $AB = C$. In this case, the phase flip is equal to the identity
operation. The diffusion is also equal to the identity on the uniform distribution, hence the whole
quantum walk in Verify Once does nothing and the qubit $|z\rangle = |+\rangle$ is untouched. Finally, PV
always terminates when $k \ge \lambda^9 n^{2/3}$, hence its total running time is $O(n^{5/3})$. □
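
The claim that the total running time is dominated by the last run can be illustrated numerically. The sketch below tabulates the values of $k$ tried by the outer loop and sums the cost $16 \cdot kn$ of all calls; the function names, the choice $\lambda = 15/14$ (one admissible value between 1 and $\frac87$), and the rounding of the loop bound to an integer are assumptions of this illustration.

```python
import math

def schedule(n, lam=15 / 14):
    """Values of k tried by the outer loop: sqrt(8) * lam**i for i = 0 .. log_lam(n^(2/3)) + 9."""
    last = math.ceil(math.log(n ** (2 / 3), lam)) + 9   # loop bound, rounded up
    return [math.sqrt(8) * lam ** i for i in range(last + 1)]

def total_cost(n, lam=15 / 14):
    """Sum of 16 * k * n over the whole schedule, a proxy for the worst-case running time."""
    return sum(16 * k * n for k in schedule(n, lam))
```

Because the schedule is geometric, the sum is at most a constant factor (depending only on $\lambda$) times the last term, which is of order $n^{2/3} \cdot n = n^{5/3}$.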
It remains to prove Lemma 2. Let us fix random vectors $p, q$. We call $(R, S)$ revealing iff
$a_R \cdot b_S = (p|_R \cdot A|_R) \cdot (B|^S \cdot q|_S) \ne c_{R,S}$, which is equivalent to $p|_R \cdot (A|_R \cdot B|^S) \cdot q|_S \ne p|_R \cdot C|_R^S \cdot q|_S$
due to the associativity of matrix multiplication. As we have already seen, not every marked pair is
revealing. Let $\varepsilon_{p,q}(W,k) = \Pr_{R,S}[(R, S) \text{ is revealing}]$ denote the fraction of revealing pairs, where
$|R| = |S| = k$. The proof of the following statement is in Appendix B.

Lemma 7 Let $p, q$ be picked uniformly at random. Then $\Pr[\varepsilon_{p,q}(W,k) \ge \frac18 \varepsilon(W,k)] \ge \frac18$.

Now, we show that the constant probability of picking good random vectors is compensated by
a constant number of repetitions.

Proof of Lemma 2. By Lemma 7, the success probability of Verify Once is at least $\frac p8$, where $p$ is the
success probability of Verify Once given that it guesses good vectors $p, q$ with $\varepsilon_{p,q}(W,k) \ge \frac18 \varepsilon(W,k)$.
By Theorem 1 and the proof of Theorem 6, $p = 1 - \gamma$ and $p \ge \frac18$ for every $k \ge k_1 =
O(n^{2/3}/(q(W)/8)^{1/3})$; the factor $\frac18$ in $\varepsilon(W,k)$ is compensated by taking $\sqrt8$-times bigger $k$. The
success probability of 16 independent trials is at least $1 - (1 - \frac p8)^{16} \ge 1 - (e^{-p})^2 \ge 1 - (1 - 0.64p)^2 \ge
1.28p - 0.4p^2 \ge p$, because $1 - x \le e^{-x}$, $e^{-x} \le 1 - 0.64x$ for $x \in [0,1]$, and $p \le 0.7$. □
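
The end-to-end inequality used in the proof of Lemma 2, namely that 16 trials with success probability $\frac p8$ each succeed with probability at least $p$ whenever $p \le 0.7$, can also be checked directly on a grid of values. This is a numerical sanity check rather than a proof, it bypasses the intermediate estimates, and the function name and grid resolution are arbitrary choices of this sketch.

```python
def sixteen_trials(p):
    """Probability that at least one of 16 independent trials succeeds, each with probability p/8."""
    return 1 - (1 - p / 8) ** 16

# Grid over the interval [0, 0.7] in steps of 0.001.
grid = [i / 1000 for i in range(0, 701)]
worst_margin = min(sixteen_trials(p) - p for p in grid)
```

The margin is zero at $p = 0$ and positive elsewhere on the grid; since $1 - (1 - \frac p8)^{16} - p$ is concave in $p$, nonnegativity at both endpoints already implies nonnegativity on the whole interval.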

4.2 Comparison with other quantum walk search algorithms 

Product Verification resembles a few other algorithms. The first quantum algorithm of this type
was the quantum walk algorithm for element distinctness [Amb04]. The same technique was subsequently
successfully applied to triangle finding [MSS05] and group commutativity testing [MN05].
Both algorithms walk on the Johnson graph $J(n,k)$. The analysis of Ambainis [Amb04] relies on
the fact that the quantum state stays in a constant-dimensional subspace. This constraint is satisfied
if there is at most one solution; then the subsets can be divided into a constant number of
cases. In the non-promise version, the number of cases is, however, not constant. The latter papers
[Amb04, MSS05] solve the non-promise case by projecting the input into a random subspace.
With high probability, there is exactly one solution in the subspace; this trick originally comes
from Valiant and Vazirani [VV86]. Since it is not known whether this technique can be used in
more than one dimension, we solve the non-promise version of product verification using the more
general quantum walk by Szegedy [Sze04] instead of the original one by Ambainis [Amb04].

Theorem 1 is quite general, because it allows walking on an arbitrary undirected graph. On the
other hand, the algorithm Verify Once obtained by it is a bit slower than the original Ambainis
walk [Amb04, MSS05]. First, Verify Once only solves the decision version of the problem and it
does not find the actual position of a wrong entry. This can be resolved by a binary search. Second, 
Verify Once does the phase flip after every step of quantum walk instead of doing it once per block 
of steps. However, for both element distinctness and product verification, the additional cost is 
subsumed by the cost of the quantum walk. 

5 Lower bounds on the fraction of marked pairs 

In this section, we try to solve the following combinatorial problem: 

Problem. Given an $n \times n$ Boolean matrix $W$ and two integers $1 \le r, s \le n$, what is the probability
$\varepsilon(W, r, s)$ that a random $r \times s$ sub-matrix of $W$ contains a 1? Or, equivalently: Given a bipartite
graph on $n, n$ vertices, what is the probability that a randomly chosen induced subgraph with $r, s$
vertices contains at least one edge?

It is simple to prove that $\varepsilon(W, r, s)$ is monotone in all its three parameters. As we have seen in
Theorem 6, the expected running time of Product Verification depends on the fraction of marked
pairs, which is $\varepsilon(W, k, k)$, also denoted there by $\varepsilon(W, k)$.$^2$ Let us compute $\varepsilon$ when $W$ contains exactly
one 1: $\varepsilon(W, r, s) = \binom{n-1}{r-1}\binom{n-1}{s-1} / \binom{n}{r}\binom{n}{s} = \frac{rs}{n^2}$. With this bound, monotonicity, and Theorem 6, we
conclude that Product Verification finds the correct answer with bounded error in time $O(n^{5/3})$.
The rest of this section contains a more detailed analysis of the expected running time of the
algorithm for larger $W$. Unfortunately, we are only able to prove weak lower bounds on $\varepsilon$ for
general $W$. However, if one improves them, then we automatically get an improved upper bound
on the running time of the same algorithm.
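
For small $n$ the probability $\varepsilon(W, r, s)$ can be computed exactly by enumerating all sub-matrices. The sketch below is a brute-force illustration only; the function name, the representation of $W$ as a set of 0-based positions, and the use of exact fractions are our own choices.

```python
from fractions import Fraction
from itertools import combinations

def marked_fraction(W, n, r, s):
    """Exact probability that a uniformly random r x s sub-matrix contains a position of W."""
    hits = total = 0
    for R in combinations(range(n), r):
        for S in combinations(range(n), s):
            total += 1
            if any(i in R and j in S for (i, j) in W):
                hits += 1
    return Fraction(hits, total)
```

With a single wrong entry the result equals $\frac{rs}{n^2}$, as computed above. If the wrong entries fill one whole row, the result is only $\frac rn$ no matter how large $s$ is, which illustrates why such inputs are the hard case discussed later in this section.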

Henceforth, let $r, s$ be sufficiently small. The average probability over all sets $W$ with $t$ ones is
$E_{W : |W| = t}[\varepsilon(W, r, s)] = \Omega(|W| \frac{rs}{n^2})$ (Lemma 8). We are able to prove the same bound for all $|W| \le \sqrt n$
(Lemma 9), when $W$ is an independent set, that is it does not contain two ones in the same row
or column (Lemma 10), or when the ones in $W$ form a rectangle (again Lemma 9). However, the
latter rectangle bound only holds for a limited class of $r, s$, which does not include the balanced
case $r = s = k$ in the range used by our algorithm. As a consequence, if the ones in $W$ form a full
row or a column, our algorithm is slower than what would follow from this formula. We, however,
show that in this case our algorithm is optimal (Theorem 11); this is the only known tight lower
bound for our algorithm. Most of the proofs are postponed to Appendix C.

Lemma 8 Let $rs \le \frac{n^2}{t}$. Then $E_{W : |W| = t}[\varepsilon(W, r, s)] = \Omega(|W| \frac{rs}{n^2})$.

Lemma 9 Let $w$ be the number of nonzero rows of $W$, and let $w'$ be the maximal number of nonzero
entries in a row. Then for every $r \le \frac nw$ and $s \le \frac n{w'}$, $\varepsilon(W, r, s) = \Omega(|W| \frac{rs}{n^2})$.

Lemma 10 Let $W$ have at most one entry in every row and column. Then for every $r, s$ satisfying
$rs \le n^{4/3}/|W|^{2/3}$, $\varepsilon(W, r, s) = \Omega(|W| \frac{rs}{n^2})$.

The main Lemma 3 is a direct corollary of Lemmas 9 and 10.

Proof of Lemma 3. Lemma 9 implies that $\varepsilon(W,k) = \Omega(\frac{k^2}{n^2} \min(|W|, \sqrt n))$: First, assume that
$|W| \le \sqrt n$ and verify the restrictions on $r = s = k$. For every $t \le \sqrt n$ it holds that $n^{2/3}/t^{1/3} \le n/t$.
Hence if $|W| \le \sqrt n$, then for every $k \le n^{2/3}/|W|^{1/3}$ it holds that $k \le n/|W|$ and, since $w, w' \le |W|$,
also $k \le \frac nw$ and $k \le \frac n{w'}$. Hence the lower bound $\varepsilon(W,k) = \Omega(\frac{k^2}{n^2}|W|)$ given by Lemma 9 holds
for every $k$ in the range required by Lemma 3. Now, if $|W| > \sqrt n$, the bound follows from the
monotonicity of $\varepsilon(W, k)$ in $W$.

Lemma 10 says that $\varepsilon(W', k) = \Omega(\frac{k^2}{n^2}|W'|)$ for every independent $W'$ and $k$ in the range required
by Lemma 3. The bound on $W$ follows from the monotonicity of $\varepsilon(W, k)$ in $W$. If we put these two
bounds together, we obtain that $\varepsilon(W,k) = \Omega(\frac{k^2}{n^2} q(W))$, as desired. □

The bound cannot be strengthened to $\varepsilon(W, k) = \Omega(\frac{k^2}{n^2}|W|)$ for general $W$ and full range of $k$. We
show that no quantum algorithm can be fast if the $n$ ones in $W$ form a full row. A straightforward
calculation shows that $q(W)$ for this $W$ can be at most $\sqrt n$ if we want the bound on $\varepsilon$ to hold for
all $k \le O(n^{2/3}/q(W)^{1/3})$.

2 Our algorithm only tries balanced choices $r = s = k$. Since the initialization costs $O((r + s)n)$, setting one of the
variables smaller does not decrease the query complexity, but it decreases the success probability of Verify Once.

Theorem 11 Any bounded-error quantum algorithm distinguishing a correct matrix product and
a matrix product with one wrong row has query complexity $\Omega(n^{3/2})$.

Proof. We reduce Or of $n$ parities of length $n + 1$ to product verification. Let

$$z = (x_{1,1} \oplus \cdots \oplus x_{1,n} \oplus y_1) \vee \cdots \vee (x_{n,1} \oplus \cdots \oplus x_{n,n} \oplus y_n).$$

Using the quantum adversary lower bound method [Amb02], it follows that computing $z$ requires
$\Omega(n^{3/2})$ quantum queries, and the lower bound holds even if we promise that at most one parity
is equal to 1. Since $z = 1$ iff $\exists i : y_i \ne \bigoplus_{\ell=1}^n x_{i,\ell}$, we can reduce this problem to the verification of
the matrix product $AB = C$ over GF(2), where $A_{i,j} = x_{i,j}$, $B_{i,j} = 1$, and $C_{i,j} = y_i$. The promise is
transformed into the promise that at most one row is wrong. □
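
The reduction in the proof above is easy to write down explicitly. The following sketch (function names are our own; indices are 0-based) builds the three matrices over GF(2) from the bits $x_{i,j}$ and $y_i$ and lists the wrong entries of the resulting product, so that one can see that $z = 1$ iff some entire row of $C$ is wrong.

```python
def reduction(x, y):
    """Build (A, B, C) over GF(2) with A = x, B the all-ones matrix, and C[i][j] = y[i]."""
    n = len(x)
    A = [row[:] for row in x]
    B = [[1] * n for _ in range(n)]
    C = [[y[i]] * n for i in range(n)]
    return A, B, C

def wrong_entries(A, B, C):
    """Positions (i, j) where (AB)[i][j] differs from C[i][j] over GF(2)."""
    n = len(A)
    return {(i, j) for i in range(n) for j in range(n)
            if sum(A[i][l] * B[l][j] for l in range(n)) % 2 != C[i][j]}

def or_of_parities(x, y):
    """The function z from the proof: OR over rows of the parity of the row together with y[i]."""
    return int(any((sum(row) + yi) % 2 for row, yi in zip(x, y)))
```

Every entry of row $i$ of $AB$ equals the parity of row $i$ of $x$, so a violated parity makes all $n$ entries of that row wrong at once.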



6 Concluding remarks 

6.1 Algorithm for computation of matrix products 

Let $m \ge n^{2/3}$. One can modify the algorithm to verify the product $A^{n \times m} B^{m \times n} = C^{n \times n}$ in time
proportional to $n^{2/3} m$. The quantum walk stays the same and only the inner scalar products are
of length $m$ instead of $n$.

Using the rectangular product verification algorithm and binary search, one can construct a 
quantum algorithm that outputs the position of a wrong entry. By iterating this and correcting 
the wrong entries, one can compute the matrix product AB = C whenever a good approximation 
to C is known. One can always start by guessing C = 0, hence the following bound holds: 

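Theorem 12 Let $m \ge n^{2/3}$. The matrix product $A^{n \times m} B^{m \times n} = C^{n \times n}$ can be computed with
polynomially small error probability in expected time

$$T_M \le O(1) \cdot \begin{cases} m \log n \cdot n^{2/3} w^{2/3}; & \text{for } 1 \le w \le \sqrt n, \\ m \log n \cdot \sqrt n\, w; & \text{for } \sqrt n \le w \le n, \\ m \log n \cdot n \sqrt w; & \text{for } n \le w \le n^2, \end{cases} \qquad (1)$$

where $w = |W|$ is the number of nonzero entries of $C$.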

The algorithm and its analysis are presented in Appendix D. Let us neglect the logarithmic
term. It follows that matrix products with $|W| = o(\sqrt n)$ non-zero entries can be computed in
sub-quadratic time $o(nm)$. We can also compare our algorithm to the best classical algorithms;
however, this comparison cannot be fair, since our algorithm depends on $|W|$, whereas all known
classical algorithms depend on the sparseness of the input matrices. The fastest known algorithm
for dense square matrices [CW90] works in time $O(n^{2.376})$. Our algorithm can beat it when the
number of nonzero elements of the result is $|W| = o(n^{0.876})$. The fastest known algorithm for
dense rectangular matrices [Cop97] works in time $O(n^{1.844+o(1)} m^{0.533} + n^{2+o(1)})$. The fastest known
algorithm for sparse square matrices [YZ04] works in time $O(n^{1.2} z^{0.7} + n^{2+o(1)})$, where $A$ and $B$
have at most $z$ non-zero elements.
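
The three regimes of the bound (1) and the comparison with the dense classical exponent can be explored numerically. The sketch below evaluates the right-hand side of (1) without its hidden constant; the function name and the use of the natural logarithm are assumptions of this illustration.

```python
import math

def expected_time_bound(n, m, w):
    """Right-hand side of (1) without the hidden constant; assumes 1 <= w <= n**2."""
    log = math.log(n)
    if w <= math.sqrt(n):
        return m * log * n ** (2 / 3) * w ** (2 / 3)
    if w <= n:
        return m * log * math.sqrt(n) * w
    return m * log * n * math.sqrt(w)
```

The three branches agree at $w = \sqrt n$ and at $w = n$, so the bound is continuous in $w$. For square matrices with $m = n$ and $\sqrt n \le w \le n$ it reads $n^{3/2} w$ up to the logarithm, which is below $n^{2.376}$ exactly when $w$ is below $n^{0.876}$.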
6.2 Boolean matrices

The algorithm Verify Once relies on the fact that arithmetical operations are over some integral
domain. If the matrices are over the Boolean algebra $\{\vee, \wedge\}$, then the multiplication by random
vectors from both sides does not work. However, Boolean matrix products can be verified even
faster by the following algorithm:

Theorem 13 There exists a quantum Boolean-matrix product verification algorithm running in
time $O(n\sqrt m)$ and space $O(\log n + \log m)$.

Proof. The condition that three given matrices form a valid product can be written as an And-Or
tree: And of $n^2$ equalities, each being an Or of $m$ products. There is a bounded-error quantum
algorithm [HMW03] running in time $O(\sqrt{n^2 m}) = O(n\sqrt m)$ and space $O(\log(n^2 m))$. □

By standard techniques [BBHT98], one can speed up the verification to time $O(n\sqrt{m/t})$, if the
number of wrong entries $t$ is known beforehand. If $t$ is unknown, then the verification can be done
in expected time $O(n\sqrt{m/t})$ and the worst-case time stays $O(n\sqrt m)$. The Boolean matrix product
with $t$ nonzero entries can be thus computed in expected time $O(n\sqrt{tm})$.
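
The And-Or formula from the proof of Theorem 13 can be written as a classical predicate. The function below is only a specification of what is being verified, evaluated classically in time $O(n^2 m)$; it is not the quantum algorithm of [HMW03], and the function name is our own.

```python
def boolean_product_is_valid(A, B, C):
    """And over all (i, j) of the equality C[i][j] == Or over l of (A[i][l] and B[l][j])."""
    n, m = len(A), len(B)          # A is n x m, B is m x (number of columns of C)
    cols = len(B[0])
    return all(
        bool(C[i][j]) == any(A[i][l] and B[l][j] for l in range(m))
        for i in range(n) for j in range(cols)
    )
```

The quantum speed-up comes from evaluating this two-level formula with nested searches instead of scanning all $n^2 m$ leaves.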



6.3 Open problems 

It would be interesting to strengthen the lower bound on the fraction $\varepsilon(W, k)$ of marked pairs and
thus also the upper bound on product verification. As we have shown, this cannot be done in full
generality, but perhaps one can show a stronger lower bound using some density argument.

The time complexity of our algorithm goes up if the space complexity is bounded. Can one
prove a time-space tradeoff for the verification problem similar to the tradeoff for computation of
matrix products [KSW04]? Note that we currently can't show time-space tradeoffs for any decision
problem.

Can one prove a better lower bound on verification of matrix products than $\Omega(n^{3/2})$? This
lower bound is tight when there are $\sqrt n$ wrong entries. Is the true bound higher with only one
wrong entry? Due to the small certificate complexity of this problem, one cannot prove such a
bound using any of the adversary methods [SS05], but it might be provable by the polynomial
method [BBC+01].
A Proofs for the quantum walk

Proof of Theorem 1. This is a corollary of Lemma 7 from [Sze04]. To express the lower bound
on $m$ in terms of $\delta_G, \varepsilon$, we use several other statements from that paper: Let $P = \mathcal{L}(G)$ be the
Laplacian of $G$ and let $P_M$ be obtained from $P$ by leaving out all rows and columns indexed by
some $x \in M$. By [Sze04, Lemma 10], $\lambda(P_M) \le 1 - \delta_G \varepsilon/2$. The lower bound on $m$ can be restated
as $\Omega(1/\sqrt{1 - \lambda(P_M)}) = \Omega(1/\sqrt{\delta_G \varepsilon})$ like in [Sze04, Corollary 2]. □

Proof of Lemma 4. It is not difficult to show that the spectral gap of the Johnson graph
$J(n, k)$ is $\frac{n}{(n-k)k}$, which is $\Theta(1/k)$ for $1 \le k \le \frac n2$. Furthermore, it is simple to prove that $\delta_{G_1 \times G_2} =
\min(\delta_{G_1}, \delta_{G_2})$. We conclude that $\delta_G = \Theta(1/k)$. □
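
One half of the claim about the Johnson graph can be verified exactly for small parameters: the value $\frac{n}{(n-k)k}$ is an eigenvalue of the normalized Laplacian of $J(n,k)$. The sketch below checks this with the test function $f(R) = [x \in R] - \frac kn$ for a fixed element $x$; the choice of $f$ and the function name are our own, and the check does not show that this eigenvalue is the second smallest one.

```python
from fractions import Fraction
from itertools import combinations

def check_johnson_eigenvalue(n, k, x=1):
    """Check L f = (n / (k (n - k))) f for f(R) = [x in R] - k/n on J(n, k), with 0 < k < n."""
    vertices = [frozenset(c) for c in combinations(range(1, n + 1), k)]
    d = k * (n - k)                                    # J(n, k) is d-regular
    f = {R: Fraction(int(x in R)) - Fraction(k, n) for R in vertices}
    gap = Fraction(n, k * (n - k))
    for R in vertices:
        neighbours = [S for S in vertices if len(R - S) == 1]
        Lf = f[R] - sum(f[S] for S in neighbours) / d  # (I - A/d) f for a regular graph
        if Lf != gap * f[R]:
            return False
    return True
```

For a $d$-regular graph the normalized Laplacian is $I - \frac1d$ times the adjacency matrix, which is the form used in the loop.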

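Proof of Lemma 5. $|Y\rangle = \frac12(|0,\varphi\rangle + |1,\varphi\rangle) + \frac12(|0,\psi\rangle - |1,\psi\rangle) = |0\rangle \frac{|\varphi\rangle + |\psi\rangle}{2} + |1\rangle \frac{|\varphi\rangle - |\psi\rangle}{2}$, hence
$\Pr[Y = 1] = \left\| \frac{|\varphi\rangle - |\psi\rangle}{2} \right\|^2 = \frac14 (\langle\varphi| - \langle\psi|)(|\varphi\rangle - |\psi\rangle) = \frac14 (\langle\varphi|\varphi\rangle + \langle\psi|\psi\rangle - 2\langle\varphi|\psi\rangle) = \frac12 (1 - \langle\varphi|\psi\rangle)$. □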



B Proofs for the fraction of revealing pairs 



Lemma 14 Let $G$ be an integral domain with $g$ elements, let $(R, S)$ be marked, and let $p, q$ be
picked uniformly at random from $G^n$. The probability that $(R, S)$ is revealing is $\ge (1 - \frac1g)^2 \ge \frac14$.

Proof. Let $D = AB - C$, that is the wrong entries are exactly the nonzero entries of $D$. Assume
that $(R, S)$ is marked and pick any $D_{i_0,j_0} \ne 0$. Now, $(R, S)$ is not revealing iff

$$0 = \sum_{i \in R,\, j \in S} p_i q_j D_{i,j} = p_{i_0}(q_{j_0} D_{i_0,j_0} + c_1) + c_2 q_{j_0} + c_3,$$

where $c_1, c_2, c_3$ are some constants depending on $D$ and other coordinates of $p, q$. Fix these
constants and pick $p_{i_0}, q_{j_0}$ at random from $G$. Since $G$ is an integral domain with $g$ elements,
$p = \Pr[q_{j_0} D_{i_0,j_0} + c_1 = 0] \le \frac1g$. For every $q_{j_0}$ such that $q_{j_0} D_{i_0,j_0} + c_1 \ne 0$, the equality is equivalent
to $c_4 p_{i_0} + c_5 = 0$ for other constants $c_4 \ne 0$ and $c_5$, which is again satisfied by at most 1 value
of $p_{i_0} \in G$. Hence the probability of having equality by chance is at most

$$p \cdot 1 + (1 - p) \cdot \frac1g = \frac1g + p \cdot \frac{g-1}{g} \le \frac1g + \frac{g-1}{g^2} = \frac{2g-1}{g^2}.$$

The probability that $(R, S)$ is revealing is thus at least $1 - \frac{2g-1}{g^2} = (1 - \frac1g)^2 \ge \frac14$ and the equality
holds when $g = 2$. □
Lemma 15 Let $0 \le X \le 1$ and $E[X] \ge \alpha$. Then $\Pr[X \ge \beta] \ge \alpha - \beta$. 
Proof. Decompose the expected value of $X$ conditioned on $X \ge \beta$: 

$$E[X] = E[X \mid X < \beta] \cdot (1 - \Pr[X \ge \beta]) + E[X \mid X \ge \beta] \cdot \Pr[X \ge \beta].$$

Rearrange, plug in $0 \le X \le 1$, and obtain $\Pr[X \ge \beta] = \frac{E[X] - E[X \mid X < \beta]}{E[X \mid X \ge \beta] - E[X \mid X < \beta]} \ge \frac{\alpha - \beta}{1 - 0} = \alpha - \beta$. □ 
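A quick exact check of Lemma 15 on a finite distribution may help to see how the bound is used. The distribution below is a made-up example, and the helper name is ours.

```python
from fractions import Fraction

def tail_and_bound(dist, beta):
    """dist: list of (value, probability) pairs with values in [0, 1].
    Returns (Pr[X >= beta], E[X] - beta)."""
    mean = sum(x * pr for x, pr in dist)
    tail = sum(pr for x, pr in dist if x >= beta)
    return tail, mean - beta

# Hypothetical distribution: X = 0, 1/2, 1 with probabilities 1/2, 1/4, 1/4.
dist = [(Fraction(0), Fraction(1, 2)),
        (Fraction(1, 2), Fraction(1, 4)),
        (Fraction(1), Fraction(1, 4))]
tail, bound = tail_and_bound(dist, Fraction(1, 8))
assert tail >= bound
```

Here $E[X] = \frac38$, so with $\beta = \frac18$ the lemma guarantees $\Pr[X \ge \frac18] \ge \frac14$, and the actual value is $\frac12$.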

Proof of Lemma 3. Consider Boolean random variables $V_{R,S,p,q} = 1$ iff $(R, S, p, q)$ is revealing. 
Let $v_{p,q}$ be the fraction of marked sets that are also revealing when $p, q$ multiply the equation, 
formally $v_{p,q} = E_{\text{marked } (R,S)}[V_{R,S,p,q}]$. By Lemma 14, for every marked $(R, S)$, $E_{p,q}[V_{R,S,p,q}] \ge \frac{1}{4}$. It 
follows that $E_{\text{marked } (R,S)}[E_{p,q}[V_{R,S,p,q}]] \ge \frac{1}{4}$ and hence $E_{p,q}[E_{\text{marked } (R,S)}[V_{R,S,p,q}]] = E_{p,q}[v_{p,q}] \ge \frac{1}{4}$. 
By Lemma 15, when $p, q$ is picked uniformly at random, $\Pr[v_{p,q} \ge \frac{1}{8}] \ge \frac{1}{8}$. Hence in this lucky 
case, $\zeta_{p,q}(W, k) \ge \frac{1}{8} \varepsilon(W, k)$. □ 



C Proofs for the fraction of marked pairs 

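Proof of Lemma 8. Consider Boolean random variables $V_{R,S,W} = 1$ iff $W \cap R \times S \ne \emptyset$. Then 
for every $|R| = r$ and $|S| = s$, it holds that $E_{W : |W| = t}[V_{R,S,W}] = \Pr_W[W \cap R \times S \ne \emptyset]$ and 

$$\Pr_W[W \cap R \times S \ne \emptyset] = 1 - \frac{n^2 - rs}{n^2} \cdot \frac{n^2 - rs - 1}{n^2 - 1} \cdots \frac{n^2 - rs - t + 1}{n^2 - t + 1} \ge 1 - \Big( \frac{n^2 - rs}{n^2} \Big)^t = 1 - \Big( 1 - \frac{rs}{n^2} \Big)^t \ge 1 - e^{-rst/n^2} = \Omega\Big( \frac{rst}{n^2} \Big),$$

because $1 - x \le e^{-x}$ and, on any fixed interval $x \in [0, A]$, also $e^{-x} \le 1 - \frac{1 - e^{-A}}{A} x$. The claim is now 
proved using standard arguments. Since the lower bound on $E_W[V_{R,S,W}]$ holds for all $R, S$, it also holds for $E_{R,S}[E_W[V_{R,S,W}]]$. 
Exchange the order of summation and obtain $E_W[E_{R,S}[V_{R,S,W}]] = E_W[\varepsilon(W, r, s)] = \Omega(\frac{rst}{n^2})$. □ 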
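The chain of inequalities in this proof can be evaluated numerically. The sketch below computes the exact hitting probability of a uniformly random $t$-subset $W$ of $[n] \times [n]$ and compares it with $1 - e^{-rst/n^2}$ and with the linear bound obtained for $A = 1$. The choice $A = 1$ and the function names are assumptions of this example.

```python
import math
from fractions import Fraction

def hit_probability(n, r, s, t):
    """Exact Pr[W meets R x S] for a uniformly random t-subset W of [n] x [n],
    where |R| = r and |S| = s."""
    miss = Fraction(math.comb(n * n - r * s, t), math.comb(n * n, t))
    return 1 - miss

def lower_bounds(n, r, s, t):
    """Returns (1 - exp(-x), (1 - 1/e) x) for x = rst/n^2.
    The second bound is only claimed for x <= 1."""
    x = r * s * t / (n * n)
    return 1 - math.exp(-x), (1 - math.exp(-1)) * x

exact = hit_probability(10, 2, 3, 5)
exp_bound, linear_bound = lower_bounds(10, 2, 3, 5)
assert exact >= exp_bound >= linear_bound
```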

Proof of Lemma 9. Let $Z$ denote the random event "$W \cap R \times S \ne \emptyset$". For $j = 0, 1, \ldots, w$, let 
$Z_j$ denote the random event "$W \cap R \times [n]$ has exactly $j$ nonempty rows". Since $\{Z_j\}$ are disjoint 
and $\sum_{j=0}^{w} \Pr[Z_j] = 1$, we can decompose the probability 

$$\Pr[Z] = \sum_{j=0}^{w} \Pr[Z \mid Z_j] \cdot \Pr[Z_j] \ge \sum_{j=1}^{w} \Pr[Z_j] \cdot \Pr[Z \mid Z_1] = (1 - \Pr[Z_0]) \cdot \Pr[Z \mid Z_1],$$

because $\Pr[Z \mid Z_j] \ge \Pr[Z \mid Z_1]$ for $j \ge 1$. Now, 

$$\Pr[Z_0] = \Pr[W \cap R \times [n] = \emptyset] = \frac{n-w}{n} \cdot \frac{n-w-1}{n-1} \cdots \frac{n-w-r+1}{n-r+1} \le \Big( \frac{n-w}{n} \Big)^r = \Big( 1 - \frac{w}{n} \Big)^r \le e^{-rw/n}$$

because $1 - x \le e^{-x}$. Recall that for every $x \in [0, A]$, $e^{-x} \le 1 - \frac{1 - e^{-A}}{A} x$. If $r \le \frac{n}{w}$, then $\frac{rw}{n} \le 1$ and hence 
$e^{-rw/n} \le 1 - (1 - e^{-1}) \frac{rw}{n} = 1 - \alpha \frac{rw}{n}$ for $\alpha = 1 - e^{-1}$. We conclude that $1 - \Pr[Z_0] \ge \alpha \frac{rw}{n}$. 

To lower-bound the other term, we decompose $Z_1$. For $i = 1, 2, \ldots, n$, let $Y_i$ denote the random 
event "$W \cap R \times [n]$ has the $i$-th row nonempty and all other rows are empty". Let $w_i$ be the number 
of entries in the $i$-th row of $W$ and let $w' = \max_i w_i$. Since $\{Y_i\}$ are disjoint and $Y_1 \cup \ldots \cup Y_n = Z_1$, 

$$\Pr[Z \mid Z_1] = \sum_{i : w_i \ne 0} \Pr[Z \mid Y_i] \cdot \Pr[Y_i \mid Z_1] = \frac{1}{w} \sum_{i : w_i \ne 0} \Pr[Z \mid Y_i].$$

$\Pr[Z \mid Y_i]$ is easy to evaluate, since the $i$-th row of $W$ contains exactly $w_i$ entries and $S$ is picked 
uniformly at random. Let $W_i = W \cap R \times [n]$ be the $i$-th row of $W$. By the same arguments as 
above, $\Pr[Z \mid Y_i] = \Pr[W_i \cap [n] \times S \ne \emptyset \mid |W_i| = w_i] = 1 - \Pr[W_i \cap [n] \times S = \emptyset \mid |W_i| = w_i] \ge 1 - e^{-s w_i / n}$. 

Analogously, if $s \le \frac{n}{w'}$, then $e^{-s w_i / n} \le 1 - \alpha \frac{s w_i}{n}$ and $\Pr[Z \mid Y_i] \ge \alpha \frac{s w_i}{n}$. Plug both bounds together 
and obtain $\Pr[Z] \ge \alpha \frac{rw}{n} \cdot \frac{1}{w} \sum_{i : w_i \ne 0} \alpha \frac{s w_i}{n} = \alpha^2 \frac{rs}{n^2} \sum_{i : w_i \ne 0} w_i = \Omega\big( \frac{rs}{n^2} |W| \big)$, as desired. □ 
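For very small $n$ the fraction of marked pairs can be computed by brute force and compared with the explicit constant $\alpha^2 \frac{rs}{n^2}|W|$ that the proof yields. The sketch below does this for a hand-picked $W$; the instance is hypothetical, and the side conditions $r \le n/w$ and $s \le n/w'$ are checked by the caller, not by the code.

```python
import math
from fractions import Fraction
from itertools import combinations

def marked_fraction(W, n, r, s):
    """Exact fraction of pairs (R, S), |R| = r, |S| = s, with W meeting R x S."""
    hits = total = 0
    for R in combinations(range(n), r):
        for S in combinations(range(n), s):
            Rs, Ss = set(R), set(S)
            hits += any(i in Rs and j in Ss for i, j in W)
            total += 1
    return Fraction(hits, total)

def proof_bound(W, n, r, s):
    """alpha^2 * r s |W| / n^2 with alpha = 1 - 1/e, as derived in the proof."""
    alpha = 1 - math.exp(-1)
    return alpha ** 2 * r * s * len(W) / (n * n)

# Hypothetical instance: w = 2 nonempty rows, at most w' = 2 entries per row.
W = [(0, 0), (0, 1), (1, 2)]
assert marked_fraction(W, 6, 2, 2) >= proof_bound(W, 6, 2, 2)
```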

Proof of Lemma 10. If $t \le \sqrt{n}$, then the result follows from Lemma 9. Let us assume that 
$t > \sqrt{n}$. Again, let $Z$ denote the random event "$W \cap R \times S \ne \emptyset$" and, for $j = 0, 1, \ldots, r$, let 
$Z_j$ denote the random event "$W \cap R \times [n]$ has exactly $j$ nonempty rows". Then 

$$1 - \Pr[Z \mid Z_j] = \Pr[W \cap R \times S = \emptyset \mid Z_j] = \frac{n-s}{n} \cdot \frac{n-s-1}{n-1} \cdots \frac{n-s-j+1}{n-j+1} \le \Big( \frac{n-s}{n} \Big)^j \le e^{-sj/n}.$$

Since $j \le r$ and $t > \sqrt{n}$, we get that $sj \le rs \le n^{4/3}/t^{2/3} \le n^{4/3}/(\sqrt{n})^{2/3} = n$. Hence $\frac{sj}{n} \le 1$ and 
by upper-bounding the exponential we get that $\Pr[Z \mid Z_j] \ge 1 - e^{-sj/n} \ge 1 - (1 - \alpha \frac{sj}{n}) = \alpha \frac{sj}{n}$ for 
$\alpha = 1 - e^{-1}$. Now, $\{Z_j\}$ are disjoint and $\sum_{j=0}^{r} \Pr[Z_j] = 1$, hence we can decompose the probability 

$$\Pr[Z] = \sum_{j=0}^{r} \Pr[Z \mid Z_j] \cdot \Pr[Z_j] \ge \sum_{j=0}^{r} \alpha \frac{sj}{n} \Pr[Z_j] = \alpha \frac{s}{n} \cdot \sum_{j=0}^{r} j \cdot \Pr[Z_j] = \alpha \frac{s}{n} \cdot E[Y],$$

where $Y$ is the number of nonempty rows. There are $r$ rows among $n$ in $R$ and we pick $t$ entries 
without returning uniformly at random. An application of $E[Y] = \frac{rt}{n}$ completes the proof. □ 
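The expectation of $Y$ used in the last step can be confirmed by exact enumeration. The sketch below assumes, as our reading of the setting, that $W$ has at most one entry per row, so that the number of nonempty rows of $W$ equals $t$; it then averages the number of those rows that fall into a uniformly random $R$ of size $r$.

```python
from fractions import Fraction
from itertools import combinations

def expected_nonempty_rows(W, n, r):
    """Exact E[Y], where Y is the number of rows of R containing an entry of W,
    over all subsets R of [n] of size r."""
    rows = {i for i, _ in W}
    subsets = list(combinations(range(n), r))
    return Fraction(sum(len(rows & set(R)) for R in subsets), len(subsets))

# Hypothetical W with t = 3 entries in distinct rows and columns, n = 6, r = 2.
W = [(0, 3), (2, 1), (4, 4)]
assert expected_nonempty_rows(W, 6, 2) == Fraction(2 * 3, 6)
```

By linearity of expectation each of the $t$ rows lands in $R$ with probability $r/n$, which is what the enumeration reproduces.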



D Computation of matrix products 

In this section, we show how to use (the rectangular version of) Product Verification to obtain the 
actual position of a wrong entry. Furthermore, we present an algorithm for computation of matrix 
products. The algorithms are described in Figure 2. 

Theorem 16 Find Wrong Entry has one-sided polynomially small error, worst-case running time 
$O(n^{2/3} m \log n)$, and expected running time $O(n^{2/3} m \log n / q(W)^{1/3})$ for the set of wrong entries $W$. 

Proof. Assume that $AB \ne C$. Let $W^\ell$ be the set of wrong entries in the $\ell$-th recursion level of 
the binary search. From the definition of $q(W)$, if $q(W^\ell) = q$, then $q(W^\ell_{i,j}) \ge \frac{q}{4}$ for at least one 
quadrant $i, j \in \{1, 2\}$. Find Wrong Entry descends into the first quadrant it finds a solution in, 
hence it chooses $W^{\ell+1} = W^\ell_{i,j}$ with high probability and then $(\frac{n}{2})^2 / q(W^{\ell+1}) \le n^2 / q(W^\ell)$. There 
are $\log n$ levels of the recursion. Hence its expected running time is at most 

$$\sum_{\ell=1}^{\log n} O\Big( \sqrt[3]{\frac{(n/2^\ell)^2}{q(W^\ell)}} \; m \Big) \le \sum_{\ell=1}^{\log n} O\Big( \sqrt[3]{\frac{n^2}{q(W)}} \; m \Big) = O\Big( \frac{n^{2/3} m \log n}{q(W)^{1/3}} \Big),$$

as claimed. By the bounded-error guarantee of Product Verification, the probability that a wrong matrix product is not recognized in one 
iteration is bounded by a constant smaller than 1. The probability that it is not recognized in $O(\log n)$ iterations is $1/\mathrm{poly}(n)$. 
If $AB = C$, then the first iteration of binary search is repeated $O(\log n)$ times and the worst-case 
running time is $O(n^{2/3} m \log n)$. □ 
Matrix Multiplication (input size $n, m$, matrices $A_{n \times m}$, $B_{m \times n}$) returns $C_{n \times n} = AB$: 

1. Initialize $C = 0$. 

2. Run Find Wrong Entry $(n, m, A, B, C)$. 
If it returns "equal", return $C$. 

3. Otherwise let $(r, c)$ be the wrong position. Recompute $C_{r,c}$. 
Find and recompute all wrong entries in the $r$-th row using the Grover Search. 
Find and recompute all wrong entries in the $c$-th column using the Grover Search. 

4. Go to step 2. 

Find Wrong Entry (input size $n, m$, matrices $A_{n \times m}$, $B_{m \times n}$, $C_{n \times n}$) 
returns a position $(r, c)$ if $C_{r,c} \ne \sum_i A_{r,i} B_{i,c}$ or "equal" if $AB = C$: 

1. If $n = 1$, verify the scalar product and exit. 

2. Let $A_1, A_2$ denote the top and bottom half of $A$, 
let $B_1, B_2$ denote the left and right half of $B$, and 
let $C_{1,1}, C_{1,2}, C_{2,1}, C_{2,2}$ denote the four quadrants of $C$. 

3. Repeat at most $O(\log n)$ times the following step: 

• Run in parallel Product Verification $(\frac{n}{2}, m, A_i, B_j, C_{i,j})$ for $i, j \in \{1, 2\}$. 
If some of them returns "not equal", stop the other threads of computation 
and cancel the loop. 

4. If the product verification was always successful, return "equal". 

5. Let $C_{i,j} \ne A_i B_j$ be the found wrong sub-matrix. 
Let $(r', c')$ = Find Wrong Entry $(\frac{n}{2}, m, A_i, B_j, C_{i,j})$. 

6. If $i = 1$, set $r = r'$, otherwise set $r = r' + \frac{n}{2}$. 
If $j = 1$, set $c = c'$, otherwise set $c = c' + \frac{n}{2}$. 
Return $(r, c)$. 



Figure 2: Quantum algorithm for computation of matrix products 
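The control flow of Figure 2 can be illustrated classically. The sketch below is not the quantum algorithm: Product Verification is replaced by a Freivalds-style randomized check over the integers, and the Grover searches are replaced by recomputing a whole row and column. All function names are ours, and the sketch splits index ranges in half instead of assuming that $n$ is a power of two.

```python
import random

def submatrix_ok(A, B, C, rows, cols, trials=8):
    """Classical stand-in for Product Verification restricted to rows x cols.
    One-sided error: a False answer always indicates a wrong entry."""
    m = len(B)
    for _ in range(trials):
        q = {j: random.randrange(1 << 30) for j in cols}
        Bq = [sum(B[k][j] * q[j] for j in cols) for k in range(m)]
        for i in rows:
            lhs = sum(A[i][k] * Bq[k] for k in range(m))
            rhs = sum(C[i][j] * q[j] for j in cols)
            if lhs != rhs:
                return False
    return True

def halves(seq):
    """Split a range into two halves (or keep it if it has one element)."""
    mid = len(seq) // 2
    return [seq[:mid], seq[mid:]] if len(seq) > 1 else [seq]

def find_wrong_entry(A, B, C):
    """Return (r, c) with C[r][c] != (AB)[r][c], or None for "equal"."""
    rows, cols = range(len(A)), range(len(B[0]))
    if submatrix_ok(A, B, C, rows, cols):
        return None
    while len(rows) > 1 or len(cols) > 1:
        quadrants = [(x, y) for x in halves(rows) for y in halves(cols)]
        for sub_rows, sub_cols in quadrants:
            if not submatrix_ok(A, B, C, sub_rows, sub_cols):
                rows, cols = sub_rows, sub_cols  # descend into first wrong quadrant
                break
        else:
            return None  # no quadrant flagged; happens only with tiny probability
    return rows[0], cols[0]

def matrix_multiplication(A, B):
    """Repeatedly find a wrong entry of C and repair its row and column."""
    n, m, width = len(A), len(B), len(B[0])
    C = [[0] * width for _ in range(n)]
    entry = lambda i, j: sum(A[i][k] * B[k][j] for k in range(m))
    while True:
        pos = find_wrong_entry(A, B, C)
        if pos is None:
            return C
        r, c = pos
        for j in range(width):  # stand-in for the Grover search over row r
            C[r][j] = entry(r, j)
        for i in range(n):      # stand-in for the Grover search over column c
            C[i][c] = entry(i, c)
```

Each round repairs one full row and one full column, which is why the number of rounds is bounded by the size of an independent set of wrong entries in the analysis that follows.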
Remark. It might be that the position of the wrong entry can be obtained from just one run 
of Product Verification in the same way as in the quantum walk algorithm for element distinctness 
[Amb04], namely by measuring the subsets $R, S$ instead of the quantum coin register $|z\rangle$. However, 
this is only known to hold for exactly one wrong entry, that is $|W| = 1$ [Sze04, 
Section 10]. The log-factor in the total running time is necessary for polynomially small error. 

Now we can prove the upper bound on matrix multiplication. 

Proof of Theorem 12. Finding all $r_\ell$ wrong entries in the $\ell$-th row is done by the Grover search 
with unknown number of solutions [BBHT98], and it takes time $\sum_{j=1}^{r_\ell} \sqrt{\frac{n}{j}} \, m = O(\sqrt{n r_\ell} \, m)$, where 
the scalar products of length $m$ are computed on-line. We ensure that there are no wrong entries 
left with probability polynomially close to one in additional time $O(\sqrt{n} \, m \log n)$. Let us condition 
the rest of the analysis on the event that the Grover searches indeed find all wrong entries. 
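The estimate $\sum_{j=1}^{r} \sqrt{n/j} = O(\sqrt{nr})$ behind the Grover search cost can be checked numerically. The sketch below compares the sum with $2\sqrt{nr}$, which follows from $\sum_{j=1}^{r} j^{-1/2} \le 2\sqrt{r}$; the factor $m$ for the scalar products is omitted, and the function name is ours.

```python
import math

def grover_row_cost(n, r):
    """Sum_{j=1..r} sqrt(n / j), the search cost for one row with r wrong
    entries, not counting the factor m for the scalar products."""
    return sum(math.sqrt(n / j) for j in range(1, r + 1))

for n, r in [(100, 1), (100, 25), (1024, 1024)]:
    assert grover_row_cost(n, r) <= 2 * math.sqrt(n * r)
```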

Let $W'$ be the largest independent subset of $W$. Clearly, Matrix Multiplication finishes in at 
most $|W'|$ iterations, otherwise there would exist an independent set larger than $|W'|$. The total 
running time is the sum of the time spent in Find Wrong Entry 

$$T_F \le \sum_{\ell=1}^{|W'|} \frac{n^{2/3}}{\ell^{1/3}} \, m \log n = O\big( (n |W'|)^{2/3} \, m \log n \big),$$



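and the time spent in the Grover searches. By applying a Cauchy-Schwarz inequality several times, 

$$T_G \le \sum_{\ell=1}^{|W'|} \big( \sqrt{n r_\ell} \, m + \sqrt{n c_\ell} \, m \big) \log n = m \sqrt{n} \log n \Big( \sum_{\ell=1}^{|W'|} 1 \cdot \sqrt{r_\ell} + \sum_{\ell=1}^{|W'|} 1 \cdot \sqrt{c_\ell} \Big)$$

$$\le m \sqrt{n} \log n \, \sqrt{|W'|} \Big( \sqrt{\sum_{\ell=1}^{|W'|} r_\ell} + \sqrt{\sum_{\ell=1}^{|W'|} c_\ell} \Big) = O\big( m \sqrt{n} \log n \sqrt{|W'|} \sqrt{|W|} \big).$$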

The algorithm is bounded-error, because both Find Wrong Entry and the iterated Grover searches 
have polynomially small error. Put the bounds together and obtain: 

$$T_M = T_F + T_G \le m \log n \, \sqrt{|W'|} \cdot \big( n^{2/3} |W'|^{1/6} + \sqrt{n |W|} \big).$$

Evaluate separately the three cases $|W| \in [1, \sqrt{n}]$, $|W| \in [\sqrt{n}, n]$, and $|W| \in [n, n^2]$, use that 
$|W'| \le |W|$ and $|W'| \le n$, and obtain the claimed inequality, which we had to prove. □ 